Contributor

@pblazej pblazej commented Sep 18, 2025

Adds 3 basic building blocks for simple(r) agent experiences:

  • Session - connection, pre-connect, agent dispatch, agent filtering (e.g. by name), all agents, messages (broadcast and aggregated for now)
  • Agent - wrapper around Participant, knows its tracks and internal state
  • LocalMedia - (unrelated) helper to deal with local tracks in SwiftUI

Example: livekit-examples/agent-starter-swift#29

@pblazej pblazej force-pushed the blaze/agent-conversation branch from 6ea1621 to 2f9bbee Compare September 18, 2025 12:38
@pblazej pblazej force-pushed the blaze/agent-conversation branch 2 times, most recently from 94ec7d0 to e5caee2 Compare September 18, 2025 13:34
Contributor

@bcherry bcherry left a comment

this generally looks good - lmk when the API is considered final

@pblazej pblazej force-pushed the blaze/agent-conversation branch from aa93417 to 212035c Compare September 23, 2025 12:02
@pblazej pblazej marked this pull request as draft October 1, 2025 08:40
@pblazej pblazej force-pushed the blaze/connection-provider branch from 51915ab to 0c89008 Compare October 2, 2025 08:10
Base automatically changed from blaze/connection-provider to main October 14, 2025 12:29
@pblazej pblazej force-pushed the blaze/agent-conversation branch from c52f944 to 9b16217 Compare October 15, 2025 09:01

github-actions bot commented Oct 15, 2025

⚠️ This PR does not contain any files in the .changes directory.

@pblazej pblazej requested a review from 1egoman October 23, 2025 10:04
Contributor Author

pblazej commented Oct 23, 2025

The state machines:

Simulator.Screen.Recording.-.iPhone.17.-.2025-10-23.at.12.10.15.mov

@1egoman 1egoman left a comment

Nice work on this @pblazej, looking forward to seeing this get merged!

Contributor Author

pblazej commented Oct 23, 2025

Did one more fix - I noticed that there was a "gap" that we discussed in session vs agent preconnect states: a787606

Now it's kinda more obvious with case connecting(buffering: Bool) (not exposed as the whole State)

@pblazej pblazej force-pushed the blaze/agent-conversation branch 2 times, most recently from f9e3fbc to 3930acb Compare October 23, 2025 12:30
@pblazej pblazej force-pushed the blaze/agent-conversation branch from 3930acb to 7a71453 Compare October 23, 2025 13:06

private enum State {
    case disconnected
    case connecting(buffering: Bool)

Contributor

question
What does buffering mean here? Does it refer to pre-connect buffering?

Contributor Author

It's equivalent to isBufferingSpeech in JS, i.e. the pre-connect buffer. It's not exposed anywhere, but we can rename it.

mutating func connecting(buffering: Bool) {
    log("Agent connecting from \(state)")
    switch state {
    case .disconnected, .connecting:

Contributor

question
Should .connected be allowed here? That might mean a reconnect.

Contributor Author

There's no .reconnecting state in the agent itself. The overall Agent.State is a little artificial, as it's partially derived from the room, as mentioned above: #789 (comment)
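
To make the transition rules discussed in this thread concrete, here is a minimal sketch of such a state machine. AgentConnectionState and AgentStateMachine are hypothetical names, the payload of the connected case is omitted, and ignoring a disallowed transition is an assumption, since the diff does not show what the real code does in that branch.

```swift
// Hypothetical, simplified model of the agent state from this thread.
enum AgentConnectionState: Equatable {
    case disconnected
    case connecting(buffering: Bool)
    case connected
}

struct AgentStateMachine {
    private(set) var state: AgentConnectionState = .disconnected

    /// Enters `.connecting` only from `.disconnected` or `.connecting`,
    /// mirroring `case .disconnected, .connecting:` in the diff.
    /// Returns false when the transition is ignored.
    @discardableResult
    mutating func connecting(buffering: Bool) -> Bool {
        switch state {
        case .disconnected, .connecting:
            state = .connecting(buffering: buffering)
            return true
        case .connected:
            // Assumption: a connected agent never goes back to connecting.
            return false
        }
    }

    /// Enters `.connected` only from `.connecting` or `.connected`.
    @discardableResult
    mutating func connected() -> Bool {
        switch state {
        case .connecting, .connected:
            state = .connected
            return true
        case .disconnected:
            return false
        }
    }
}
```

With this shape, calling connecting(buffering:) on an already connected machine leaves the state untouched, which matches the reply above that the agent has no reconnecting state of its own.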

mutating func connected(participant: Participant) {
    log("Agent connected to \(participant) from \(state)")
    switch state {
    case .connecting, .connected:

Contributor

nitpick

Should you handle the no-op case? For example, by adding:

private func assign(_ new: State) {
    guard new != state else { return }
    state = new
}

Then use:
assign(.connected(agentState: participant.agentState,
                  audioTrack: participant.agentAudioTrack,
                  avatarVideoTrack: participant.avatarVideoTrack))

Contributor Author

@pblazej pblazej Oct 24, 2025

From the perspective of the UI framework, it does not really matter for performance (https://medium.com/airbnb-engineering/understanding-and-improving-swiftui-performance-37b77ac61896), as we use the default non-equatable comparison for the State struct.
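
For reference, a guard like the one suggested above needs the state type to be Equatable and, inside a value type, a mutating method. The sketch below uses a hypothetical Holder struct with an Int standing in for the real state, and counts writes through didSet only to make the skipped assignment visible.

```swift
// Hypothetical stand-in for the real state container.
struct Holder {
    /// Number of times `state` was actually written.
    private(set) var writes = 0

    private(set) var state: Int = 0 {
        didSet { writes += 1 }
    }

    /// Skips the write when the new value equals the current one.
    mutating func assign(_ new: Int) {
        guard new != state else { return }
        state = new
    }
}
```

Whether skipping the write is worth it depends on how observers compare values, which is the point made in the reply above.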


init(room: Room) {
    self.room = room
    room.add(delegate: self)

Contributor

question

Should TranscriptionDelegateReceiver hold a strong reference to the room, or should it be weak?

Could you please confirm that there is no reference cycle if they hold each other strongly?

Contributor Author

I think Room does not keep a strong ref to the delegates

let delegates = NSHashTable<AnyObject>.weakObjects()
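
The ownership argument can be checked with a small sketch. It assumes only what the reply states (the room stores its delegates weakly) and uses a plain weak box instead of NSHashTable; FakeRoom, FakeReceiver and WeakBox are hypothetical stand-ins, not SDK types.

```swift
/// Holds an object without retaining it.
final class WeakBox {
    weak var value: AnyObject?
    init(_ value: AnyObject) { self.value = value }
}

/// Hypothetical room that stores its delegates weakly.
final class FakeRoom {
    private var delegates: [WeakBox] = []

    func add(delegate: AnyObject) {
        delegates.append(WeakBox(delegate))
    }

    /// Delegates that are still alive.
    var liveDelegateCount: Int {
        delegates.filter { $0.value != nil }.count
    }
}

/// Hypothetical receiver that keeps a strong reference to the room,
/// as the initializer in the diff does.
final class FakeReceiver {
    let room: FakeRoom

    init(room: FakeRoom) {
        self.room = room
        room.add(delegate: self)
    }
}
```

Because only one direction is strong (receiver to room), releasing the last external reference to the receiver deallocates it, and the weak slot in the room becomes nil.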

/// Creates a new message stream for the transcription delegate receiver.
func messages() -> AsyncStream<ReceivedMessage> {
    let (stream, continuation) = AsyncStream.makeStream(of: ReceivedMessage.self)
    self.continuation = continuation

Contributor

question
Can messages() be called multiple times?
If so, the current code stores a single continuation, so calling messages() again overwrites it. Could that leave the old stream hanging?
I am also a bit worried that the stream never finishes, so consumers could hang.

I wonder if we should have an explicit stop function, like:
func stop() {
    room?.remove(delegate: self)
    room = nil
    continuation?.finish()
    continuation = nil
}

Contributor Author

This is a general problem with exposing AsyncStream<> here (without storing anything internally). It's not intended to be used by multiple consumers at all (known issue). There's also no AnyAsyncSequence equivalent (like AnyPublisher).

Re: "it never finish the stream either,"

It can be cancelled from the outside like this:

let locations = AsyncLocationStream()

let task = Task {
    for await location in locations.stream {
        print(location)
    }
}

task.cancel()
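
The same cancellation behavior can be sketched without the location example. cancelFromOutside is a hypothetical helper: it never calls finish() on the continuation, yet the consumer's loop still ends once its task is cancelled. How many elements the consumer sees before that depends on scheduling, so no particular output is claimed.

```swift
/// Starts a consumer, yields one element, then cancels the consumer.
/// Returns whatever the consumer saw before its loop ended.
func cancelFromOutside() async -> [Int] {
    let (stream, continuation) = AsyncStream.makeStream(of: Int.self)

    let consumer = Task { () -> [Int] in
        var seen: [Int] = []
        // Iteration stops when the stream finishes or the task is cancelled.
        for await value in stream {
            seen.append(value)
        }
        return seen
    }

    continuation.yield(1)
    consumer.cancel() // finish() is never called; cancellation ends the loop
    return await consumer.value
}
```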

Contributor Author

Re: stream idempotence, I think we've got 2 choices:

  • leave it as is - it won't register another consumer for this topic, just throwing StreamError.handlerAlreadyRegistered
  • unregister before registering, so the previous stream will stop working - I think that's the worse option
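
A minimal sketch of the first option, assuming a receiver that owns a single continuation. SingleConsumerReceiver is hypothetical, and StreamError is redefined locally for illustration only; the name of the case is taken from the comment above, not from the SDK source.

```swift
// Local stand-in for the error named in the discussion.
enum StreamError: Error {
    case handlerAlreadyRegistered
}

/// Hypothetical receiver that allows exactly one consumer.
final class SingleConsumerReceiver {
    private var continuation: AsyncStream<String>.Continuation?

    /// Returns the stream on the first call and throws on later calls,
    /// so an existing consumer is never silently replaced.
    func messages() throws -> AsyncStream<String> {
        guard continuation == nil else {
            throw StreamError.handlerAlreadyRegistered
        }
        let (stream, continuation) = AsyncStream.makeStream(of: String.self)
        self.continuation = continuation
        return stream
    }
}
```

The second option would instead finish the stored continuation before creating a new stream, which ends the first consumer's loop without any error on its side.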

private func observe(receivers: [any MessageReceiver]) {
    for receiver in receivers {
        Task { [weak self] in
            for await message in try await receiver.messages() {

Contributor

Is there any chance that messages() will throw?
If so, this loop just exits quietly.

How about wrapping it in do/catch and logging the error, so users know why a stream stopped?

Contributor Author

6846414 yeah I think it's doable
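
A sketch of what the suggested do/catch could look like, not the content of the referenced commit. drain is a hypothetical helper that collects log lines in an array instead of calling the SDK logger, so the failure path is easy to observe.

```swift
/// Consumes a stream produced by a throwing factory and records what happened.
func drain(_ messages: () async throws -> AsyncStream<String>) async -> [String] {
    var log: [String] = []
    do {
        for await message in try await messages() {
            log.append("received: \(message)")
        }
        log.append("stream finished")
    } catch {
        // Without this branch the failure would go unnoticed.
        log.append("stream failed: \(error)")
    }
    return log
}
```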

Task { [weak self] in
    for try await _ in room.changes {
        guard let self else { return }
        updateAgent(in: room)

Contributor

question
Out of curiosity: when the room changes, do we cancel the current tasks? Or do we need to worry about cancellation on disconnect?

Contributor Author

The room is immutable (public let room: Room). I don't see a reason to cancel on disconnect, since we want to observe the connection state as well: connectionState = room.connectionState

for try await _ in localParticipant.changes {
    guard let self else { return }

    microphoneTrack = localParticipant.firstAudioTrack

Contributor

This LocalMedia class is @MainActor,
so I think you should run this code on the main thread:
await MainActor.run {
    self.microphoneTrack = localParticipant.firstAudioTrack
    self.cameraTrack = localParticipant.firstCameraVideoTrack
    self.screenShareTrack = localParticipant.firstScreenShareVideoTrack
    self.isMicrophoneEnabled = localParticipant.isMicrophoneEnabled()
    self.isCameraEnabled = localParticipant.isCameraEnabled()
    self.isScreenShareEnabled = localParticipant.isScreenShareEnabled()
}

Contributor Author

We're compiling in Swift 6 mode with strict concurrency, so it would not compile if wrong:

image

Non-detached Task inherits its actor context implicitly.

What would not compile is Task.detached {}
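
A small sketch of the isolation rule described here, using a hypothetical Model class rather than LocalMedia. The unstructured Task created inside a @MainActor method inherits that isolation, so it can mutate the property directly.

```swift
@MainActor
final class Model {
    var value = 0

    /// Starts an unstructured task that inherits main-actor isolation.
    func start() -> Task<Void, Never> {
        Task {
            // Runs on the main actor; no MainActor.run hop is needed.
            value += 1
        }
    }
}
```

Replacing Task with Task.detached in start() drops the inherited isolation, and the direct mutation is then rejected under strict concurrency checking, which is the case mentioned above.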

}
}

Task {

Contributor

Since this code touches self's properties, use @MainActor [weak self] in

Contributor Author

as above #789 (comment)


guard let cameraCapturer = getCameraCapturer() else { return }
let captureOptions = CameraCaptureOptions(device: videoDevice)
_ = try? await cameraCapturer.set(options: captureOptions)

Contributor

If cameraCapturer.set fails, should we still set selectedVideoDeviceID?

How about:
guard let capturer = getCameraCapturer() else { return }
do {
    try await capturer.set(options: .init(device: videoDevice))
    await MainActor.run {
        self.selectedVideoDeviceID = videoDevice.uniqueID
    }
} catch {
    self.error = .mediaDevice(error)
}

Contributor Author

Yep 99e74b8

Contributor Author

BTW there's an issue to improve this API overall #177

@xianshijing-lk
Contributor

looks good. I have some questions / comments, please address them.

Contributor Author

pblazej commented Oct 24, 2025

@xianshijing-lk thank you for looking into that!

I think I added missing error handlers.

The only thing that is a tradeoff and cannot be easily fixed is the AsyncStream in the public API, I think the limitations of that (single consumer) are already known to the community.

I did not want to revert it to Combine or callbacks, to keep the cancellation semantics and everything related to AsyncSequence.

Contributor

@xianshijing-lk xianshijing-lk left a comment

lgtm.

@pblazej pblazej merged commit 845aee2 into main Oct 27, 2025
47 of 49 checks passed
@pblazej pblazej deleted the blaze/agent-conversation branch October 27, 2025 09:32
6 participants